{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Protein preparation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## What is protein preparation?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "The **protein preparation** phase, based on the PDB2PQR and propKa softwares, addresses e.g. the problems of assigning  titration states at the user-chosen pH; flipping the side chains of HIS, ASN, and GLN residues; and optimizing the overall hydrogen bonding network. \n",
    "\n",
    "After preparing, the **build** phase takes a prepared system and applies the chosen forcefield in order to obtain simulation-ready input files."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Let's start"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "Please cite HTMD: Doerr et al.(2016)JCTC,12,1845. \n",
      "https://dx.doi.org/10.1021/acs.jctc.6b00049\n",
      "Documentation: http://software.acellera.com/\n",
      "To update: conda update htmd -c acellera -c psi4\n",
      "\n",
      "You are on the latest HTMD version (unpackaged : /home/joao/maindisk/software/repos/Acellera/htmd/htmd).\n",
      "\n"
     ]
    }
   ],
   "source": [
    "from htmd.ui import *\n",
    "config(viewer='ngl')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Protein Preparation in HTMD"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The system preparation phase is based on the PDB2PQR software. It \n",
    "includes the following steps (from the\n",
    "[PDB2PQR algorithm\n",
    "description](http://apbs-pdb2pqr.readthedocs.io/en/latest/pdb2pqr/invoking.html)):\n",
    "\n",
    " * Compute empirical pKa values for the residues' local environment (propKa)\n",
    " * Assign titration states at the user-chosen pH;\n",
    " * Flipping the side chains of HIS (including user defined HIS states), ASN, and GLN residues;\n",
    "\n",
    " * Rotating the sidechain hydrogen on SER, THR, TYR, and CYS (if available);\n",
    " * Determining the best placement for the sidechain hydrogen on neutral HIS, protonated GLU, and protonated ASP;\n",
    " * Optimizing all water hydrogens."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "The hydrogen bonding network calculations are performed by the\n",
    "[PDB2PQR](http://www.poissonboltzmann.org/) software package. The pKa\n",
    "calculations are performed by the [PROPKA\n",
    "3.1](https://github.com/jensengroup/propka-3.1) software packages.\n",
    "Please see the copyright, license  and citation terms distributed with each.\n",
    "\n",
    "Note that this version was modified in order to use an \n",
    "externally-supplied propKa **3.1** (installed automatically via dependencies), whereas\n",
    "the original had propKa 3.0 *embedded*!\n",
    "\n",
    "The results of the function should be roughly equivalent of the system\n",
    "preparation wizard's preprocessing and optimization steps\n",
    "of Schrodinger's Maestro software."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Protein residue pKas in water"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "![](http://pub.htmd.org/tutorials/protein-preparation/naming.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Modified residue names"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "The molecule produced by the preparation modifies residue names\n",
    "according to their protonation.\n",
    "Later system-building functions assume these residue naming conventions."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "Charge +1    |  Neutral   | Charge -1\n",
    "-------------|------------|----------\n",
    " -           |  ASH       | ASP\n",
    " -           |  CYS       | CYM\n",
    " -           |  GLH       | GLU\n",
    "HIP          |  HID/HIE   |  -\n",
    "LYS          |  LYN       |  -\n",
    " -           |  TYR       | TYM\n",
    "ARG          |  AR0       |  -"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Note**: support for alternative charge states varies between the  forcefields."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Limitations"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    " * *PDB2PQR*: returns **one** solution consistent with its restraints, not the optimal one;\n",
    " * *Membrane proteins*: propKa ignores **lipid exposure** (more on this later);\n",
    " * *Large conformational changes*: local environment changes may be large enough that pKa decisions are **not transferable**."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## `proteinPrepare` function"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "The `proteinPrepare` function requires a `Molecule` object, the protein to be prepared, as an argument, and returns the prepared system, also as a `Molecule`. Logging messages will provide information and warnings about the process.\n",
    "\n",
    "```python\n",
    "def proteinPrepare(mol_in,\n",
    "                   pH=7.0,\n",
    "                   verbose=0,\n",
    "                   returnDetails=False,\n",
    "                   hydrophobicThickness=None,\n",
    "                   holdSelection=None):\n",
    "```\n",
    "\n",
    "Returns a `Molecule` object, where residues have been renamed to follow internal conventions on protonation (below). Coordinates are changed to optimize the H-bonding network. This should be roughly comparable to Schroedinger Maestro's preparation wizard."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Parameters"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "```\n",
    "mol_in : htmd.Molecule\n",
    "    the object to be optimized\n",
    "pH : float\n",
    "    pH to decide titration\n",
    "verbose : int\n",
    "    verbosity\n",
    "returnDetails : bool\n",
    "    whether to return just the prepared Molecule (False, default) or a molecule *and* a ResidueInfo\n",
    "    object including computed properties\n",
    "hydrophobicThickness : float\n",
    "    the thickness of the membrane in which the protein is embedded, or None if globular protein.\n",
    "    Used to provide a warning about membrane-exposed residues.\n",
    "holdSelection : str\n",
    "    Atom selection to be excluded from optimization.\n",
    "    Only the carbon-alpha atom will be considered for the corresponding residue.\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "`proteinPrepare()` is a convenience function. Using it\n",
    "is **not** mandatory. You can \n",
    "manipulate the input molecule with your custom functions. \n",
    "In particular,\n",
    "\n",
    "* Addition of hydrogen atoms is not required\n",
    "* Protonation states are set by renaming residues\n",
    "* HIS and other residues can be edited as coordinates\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Example"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "Prepare trypsin (PDB: 3PTB) at pH 7."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2018-03-16 16:45:27,014 - htmd.molecule.readers - INFO - Using local copy for 3PTB: /home/joao/maindisk/software/repos/Acellera/htmd/htmd/data/pdb/3ptb.pdb\n",
      "2018-03-16 16:45:28,394 - htmd.molecule.molecule - WARNING - Residue insertions were detected in the Molecule. It is recommended to renumber the residues using the Molecule.renumberResidues() method.\n",
      "2018-03-16 16:45:28,556 - propka - INFO - No pdbfile provided\n",
      "2018-03-16 16:45:31,188 - htmd.builder.preparation - WARNING - The following residue has not been optimized: CA\n",
      "2018-03-16 16:45:31,189 - htmd.builder.preparation - WARNING - The following residue has not been optimized: BEN\n",
      "2018-03-16 16:45:39,499 - htmd.builder.preparationdata - INFO - The following residues are in a non-standard state: CYS    22  A (CYX), HIS    40  A (HIE), CYS    42  A (CYX), HIS    57  A (HIP), CYS    58  A (CYX), HIS    91  A (HID), CYS   128  A (CYX), CYS   136  A (CYX), CYS   157  A (CYX), CYS   168  A (CYX), CYS   182  A (CYX), CYS   191  A (CYX), CYS   201  A (CYX), CYS   220  A (CYX), CYS   232  A (CYX)\n",
      "2018-03-16 16:45:39,935 - htmd.builder.preparationdata - WARNING - Dubious protonation state: the pKa of 3 residues is within 1.0 units of pH 7.0.\n",
      "2018-03-16 16:45:39,939 - htmd.builder.preparationdata - WARNING - Dubious protonation state:    HIS    57  A (pKa= 7.44)\n",
      "2018-03-16 16:45:39,940 - htmd.builder.preparationdata - WARNING - Dubious protonation state:    GLU    70  A (pKa= 6.10)\n",
      "2018-03-16 16:45:39,942 - htmd.builder.preparationdata - WARNING - Dubious protonation state:    N+     16T A (pKa= 7.41)\n",
      "2018-03-16 16:45:40,042 - htmd.builder.preparationdata - WARNING - Found N-terminus 83.9% buried (> 50.0% threshold)\n",
      "2018-03-16 16:45:40,043 - htmd.builder.preparationdata - WARNING - Found C-terminus involved in H bonds\n"
     ]
    }
   ],
   "source": [
    "tryp = Molecule(\"3PTB\")\n",
    "tryp_op = proteinPrepare(tryp)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Visualize protonation of residue 40"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "cfa19ea4a2074c09b0dcfd45f37b8475",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "A Jupyter Widget"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "tryp_op.view(style=\"Licorice\",sel=\"resid 40\",hold=True)\n",
    "tryp_op.view(style=\"Lines\",sel=\"same residue as exwithin 4 of resid 40\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Preparation report"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "If the `returnDetails` argument is set,  an object of type `ResidueData` is returned as a **second** return value. It carries a wealth of information on the preparation results. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2018-03-16 16:46:21,332 - htmd.builder.preparation - WARNING - The following residue has not been optimized: CA\n",
      "2018-03-16 16:46:21,334 - htmd.builder.preparation - WARNING - The following residue has not been optimized: BEN\n",
      "2018-03-16 16:46:29,334 - htmd.builder.preparationdata - INFO - The following residues are in a non-standard state: CYS    22  A (CYX), HIS    40  A (HIE), CYS    42  A (CYX), HIS    57  A (HIP), CYS    58  A (CYX), HIS    91  A (HID), CYS   128  A (CYX), CYS   136  A (CYX), CYS   157  A (CYX), CYS   168  A (CYX), CYS   182  A (CYX), CYS   191  A (CYX), CYS   201  A (CYX), CYS   220  A (CYX), CYS   232  A (CYX)\n",
      "2018-03-16 16:46:29,337 - htmd.builder.preparationdata - WARNING - Dubious protonation state: the pKa of 3 residues is within 1.0 units of pH 7.0.\n",
      "2018-03-16 16:46:29,339 - htmd.builder.preparationdata - WARNING - Dubious protonation state:    HIS    57  A (pKa= 7.44)\n",
      "2018-03-16 16:46:29,340 - htmd.builder.preparationdata - WARNING - Dubious protonation state:    GLU    70  A (pKa= 6.10)\n",
      "2018-03-16 16:46:29,341 - htmd.builder.preparationdata - WARNING - Dubious protonation state:    N+     16T A (pKa= 7.41)\n",
      "2018-03-16 16:46:29,433 - htmd.builder.preparationdata - WARNING - Found N-terminus 83.9% buried (> 50.0% threshold)\n",
      "2018-03-16 16:46:29,434 - htmd.builder.preparationdata - WARNING - Found C-terminus involved in H bonds\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "PreparationData object about 290 residues.\n",
       "Unparametrized residue names: CA, BEN\n",
       "Please find the full info in the .data property, e.g.: \n",
       "  resname  resid insertion chain       pKa protonation flipped     buried\n",
       "0     ILE     16               A       NaN         ILE     NaN        NaN\n",
       "1     VAL     17               A       NaN         VAL     NaN        NaN\n",
       "2     GLY     18               A       NaN         GLY     NaN        NaN\n",
       "3     GLY     19               A       NaN         GLY     NaN        NaN\n",
       "4     TYR     20               A  9.590845         TYR     NaN  14.642857\n",
       " . . ."
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "tryp_op, prepData = proteinPrepare(tryp, returnDetails=True)\n",
    "prepData"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Most of it is accessible in the `data` property, as a [pandas DataFrame](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index(['resname', 'resid', 'insertion', 'chain', 'pKa', 'protonation',\n",
       "       'flipped', 'patches', 'buried', 'z', 'membraneExposed',\n",
       "       'forced_protonation', 'default_protonation', 'pka_group_id',\n",
       "       'pka_residue_type', 'pka_type', 'pka_charge', 'pka_atom_name',\n",
       "       'pka_atom_sybyl_type', 'pdb2pqr_idx', 'guessedAtoms'],\n",
       "      dtype='object')"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "prepData.data.columns"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>resname</th>\n",
       "      <th>resid</th>\n",
       "      <th>pKa</th>\n",
       "      <th>protonation</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>ILE</td>\n",
       "      <td>16</td>\n",
       "      <td>NaN</td>\n",
       "      <td>ILE</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>VAL</td>\n",
       "      <td>17</td>\n",
       "      <td>NaN</td>\n",
       "      <td>VAL</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>GLY</td>\n",
       "      <td>18</td>\n",
       "      <td>NaN</td>\n",
       "      <td>GLY</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>GLY</td>\n",
       "      <td>19</td>\n",
       "      <td>NaN</td>\n",
       "      <td>GLY</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>TYR</td>\n",
       "      <td>20</td>\n",
       "      <td>9.590845</td>\n",
       "      <td>TYR</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>THR</td>\n",
       "      <td>21</td>\n",
       "      <td>NaN</td>\n",
       "      <td>THR</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>CYS</td>\n",
       "      <td>22</td>\n",
       "      <td>99.990000</td>\n",
       "      <td>CYX</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>GLY</td>\n",
       "      <td>23</td>\n",
       "      <td>NaN</td>\n",
       "      <td>GLY</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>ALA</td>\n",
       "      <td>24</td>\n",
       "      <td>NaN</td>\n",
       "      <td>ALA</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>ASN</td>\n",
       "      <td>25</td>\n",
       "      <td>NaN</td>\n",
       "      <td>ASN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  resname  resid        pKa protonation\n",
       "0     ILE     16        NaN         ILE\n",
       "1     VAL     17        NaN         VAL\n",
       "2     GLY     18        NaN         GLY\n",
       "3     GLY     19        NaN         GLY\n",
       "4     TYR     20   9.590845         TYR\n",
       "5     THR     21        NaN         THR\n",
       "6     CYS     22  99.990000         CYX\n",
       "7     GLY     23        NaN         GLY\n",
       "8     ALA     24        NaN         ALA\n",
       "9     ASN     25        NaN         ASN"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "prepData.data.loc[:,['resname','resid','pKa','protonation']].head(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "As such, it can be easily queried and written as a spreadsheet in Excel or CSV format."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "prepData.data.to_excel(\"./tryp_data.xlsx\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Special case: Membrane proteins"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "Membrane-embedded proteins are in contact with an hydrophobic region which may alter pKa values for membrane-exposed residues (image taken from [Teixeira et al., J. Chem. Theory Comput., 2016, 12 (3), pp 930–934](http://dx.doi.org/10.1021/acs.jctc.5b01114))."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "<center>![](http://pubs.acs.org/appl/literatum/publisher/achs/journals/content/jctcce/2016/jctcce.2016.12.issue-3/acs.jctc.5b01114/20160302/images/large/ct-2015-01114c_0002.jpeg)</center>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Although the effect is not currently taken into account quantitatively, if a `hydrophobicThickness` argument is provided, warnings will be generated for residues exposed to the lipid region."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "The following example shows the preparation of the mu opioid receptor, 4DKL. \n",
    "The **pre-oriented** structure is retrieved  from the OPM database."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2018-03-16 16:47:01,783 - htmd.molecule.molecule - INFO - Removed 2546 atoms. 4836 atoms remaining in the molecule.\n",
      "2018-03-16 16:47:01,876 - htmd.molecule.molecule - INFO - Removed 364 atoms. 4472 atoms remaining in the molecule.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "32.0\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2018-03-16 16:47:29,115 - htmd.builder.preparationdata - INFO - The following residues are in a non-standard state: ASP   114  A (ASH), CYS   140  A (CYX), HIS   171  A (HID), CYS   217  A (CYX), HIS   223  A (HID), HIS   297  A (HID), HIS   319  A (HIE), ASP   114  B (ASH), CYS   140  B (CYX), HIS   171  B (HID), CYS   217  B (CYX), HIS   223  B (HID), HIS   297  B (HID), HIS   319  B (HIE)\n",
      "2018-03-16 16:47:29,118 - htmd.builder.preparationdata - WARNING - Dubious protonation state: the pKa of 6 residues is within 1.0 units of pH 7.0.\n",
      "2018-03-16 16:47:29,119 - htmd.builder.preparationdata - WARNING - Dubious protonation state:    ASP   114  A (pKa= 7.85)\n",
      "2018-03-16 16:47:29,121 - htmd.builder.preparationdata - WARNING - Dubious protonation state:    HIS   223  A (pKa= 6.36)\n",
      "2018-03-16 16:47:29,122 - htmd.builder.preparationdata - WARNING - Dubious protonation state:    ASP   114  B (pKa= 7.85)\n",
      "2018-03-16 16:47:29,123 - htmd.builder.preparationdata - WARNING - Dubious protonation state:    HIS   223  B (pKa= 6.36)\n",
      "2018-03-16 16:47:29,124 - htmd.builder.preparationdata - WARNING - Dubious protonation state:    N+     65T A (pKa= 7.76)\n",
      "2018-03-16 16:47:29,125 - htmd.builder.preparationdata - WARNING - Dubious protonation state:    N+     65T B (pKa= 7.76)\n",
      "2018-03-16 16:47:29,342 - htmd.builder.preparationdata - WARNING - Predictions for 34 residues may be incorrect because they are exposed to the membrane (-16.0<z<16.00 and buried<75.0%).\n"
     ]
    }
   ],
   "source": [
    "from htmd.util import opm\n",
    "mor, thickness = opm(\"4dkl\") \n",
    "print(thickness)\n",
    "mor.filter(\"protein and noh\")\n",
    "mor_opt, mor_data = proteinPrepare(mor, returnDetails=True,\n",
    "                                   hydrophobicThickness=thickness)\n",
    "\n",
    "exposedRes = mor_data.data.membraneExposed\n",
    "mor_data.data[exposedRes]\n",
    "mor_data.data[exposedRes].to_excel(\"./mor_exposed_residues.xlsx\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Case 2. Building with a ligand\n",
    "\n",
    "Coexistence and automatic placement of a ligand requires further manipulation,\n",
    "because:\n",
    "\n",
    "1. The ligand may have to be arranged in a geometrically sensible way\n",
    "2. We likely need additional parameters and topologies\n",
    "\n",
    "See the tutorial [System Building Trypsin-Benzamidine](https://software.acellera.com/docs/latest/htmd/tutorials/system-building-protein-ligand.html)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Case 3. Membrane proteins\n",
    "\n",
    "Pre-equilibrated membranes can be merged with pre-oriented systems,\n",
    "e.g. downloaded from the OPM. See the tutorial [System Building μ-opioid Receptor in Membrane](https://software.acellera.com/docs/latest/htmd/tutorials/system-building-protein-in-membrane.html)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Citations"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "Please acknowledge your use of PDB2PQR and PropKa by citing:\n",
    "\n",
    "*   Dolinsky TJ, Czodrowski P, Li H, Nielsen JE, Jensen JH, Klebe G, Baker NA. PDB2PQR: Expanding and upgrading automated preparation of biomolecular structures for molecular simulations. Nucleic Acids Res, 35, W522-5, 2007. \n",
    "*   Sondergaard, Chresten R., Mats HM Olsson, Michal Rostkowski, and Jan H. Jensen. \"Improved Treatment of Ligands and Coupling Effects in Empirical Calculation and Rationalization of pKa Values.\" Journal of Chemical Theory and Computation 7, no. 7 (2011): 2284-2295."
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "celltoolbar": "Slideshow",
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.10"
  },
  "widgets": {
   "state": {
    "97d97c883f3143d9b5c0d6e4341c6b21": {
     "views": [
      {
       "cell_index": 22
      }
     ]
    },
    "ca44adb71e664d59b619bf10f8c210b9": {
     "views": [
      {
       "cell_index": 22
      }
     ]
    }
   },
   "version": "1.2.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
